One Year of Contender: What Have We Learned about Assessing and Tuning Industrial Spoken Dialog Systems?

Authors

  • David Suendermann-Oeft
  • Roberto Pieraccini
Abstract

A lot. Since the inception of Contender, a machine learning method tailored for computer-assisted decision making in industrial spoken dialog systems, it has been rolled out in over 200 instances throughout our applications, processing nearly 40 million calls. The net effect of this data-driven method is a significantly increased system performance, gaining about 100,000 additional automated calls every month.

1 From the unwieldiness of data to the Contender process

Academic institutions involved in research on spoken dialog systems often lack access to data for training, tuning, and testing their systems. This is simply because the majority of systems live only in laboratory environments and hardly ever get deployed to live users. (One of the few exceptions to this rule is the Let's Go bus information system maintained at Carnegie Mellon University in Pittsburgh (Raux et al., 2005).) The lack of data can result in insufficiently tested systems, models trained on non-representative or artificial data, and systems restricted to narrow domains (usually restaurant or flight information). In industrial settings, on the other hand, spoken dialog systems are often deployed to take over tasks of call center agents and are thus associated with potentially very large amounts of traffic; here, we are speaking of applications which may process more than one million calls per week. Having applications log every action they take during the course of a call can provide developers with valuable data to tune and test the systems they maintain. As opposed to the academic world, there often appears to be too much data to capture, permanently store, mine, and retrieve: hard disks on application servers run full, log processing scripts demand too much computing capacity, database queues get stuck, queries slow down, and so forth.
Even if these billions and billions of log entries are eventually available for random access from a highly indexed database cluster, it is not clear what one should search for in an attempt to improve a dialog system's performance. About a year and a half ago, we proposed a method we called Contender, which plays the role of a live experiment in a deployed spoken dialog system (Suendermann et al., 2010a). Conceptually, a Contender is an activity in a call flow which has one input transition and multiple output transitions (alternatives). When a call hits a Contender's input transition, a randomization is carried out to determine which alternative the call will continue with (see Figure 1). The Contender itself does nothing but perform this random decision at runtime; it is the call flow activities and processes that the individual alternatives get routed to which make calls depend on the Contenders' decisions. Say one wants to find out which of ten possible time-out settings in an activity is optimal. This could be achieved by duplicating the activity in question ten times and setting each copy's time-out to a different value. Now, a Contender is placed whose ten alternatives get connected to the ten competing activities.
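The routing behavior described above can be sketched in a few lines. The following is a minimal illustrative model, not the authors' implementation: all class and variable names are hypothetical. A Contender is a weighted random choice over competing alternatives, and each decision is logged so that the performance of the alternatives can be compared afterwards:

```python
import random


class Contender:
    """Routes each incoming call at random to one of several competing
    alternatives and logs the decision so the performance of each
    alternative can be compared later. (Illustrative sketch only;
    names are hypothetical, not from the paper.)"""

    def __init__(self, name, alternatives, weights=None):
        self.name = name
        self.alternatives = list(alternatives)
        # Split traffic uniformly unless the experimenter biases it.
        self.weights = weights or [1.0] * len(self.alternatives)
        self.log = []  # list of (call_id, chosen_alternative) pairs

    def route(self, call_id):
        """Perform the random decision for one call and record it."""
        choice = random.choices(self.alternatives, weights=self.weights, k=1)[0]
        self.log.append((call_id, choice))
        return choice


# The time-out example from the text: duplicate an activity ten times,
# each copy configured with a different time-out value, and let the
# Contender distribute live calls among the ten copies.
timeouts = [f"activity_timeout_{s}s" for s in range(1, 11)]
contender = Contender("timeout_experiment", timeouts)

branch = contender.route(call_id="call-0001")
assert branch in timeouts
assert len(contender.log) == 1
```

After enough traffic has flowed through, the per-alternative log entries can be joined with call outcomes (e.g., automated vs. escalated) to decide which time-out setting wins.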


Similar articles

Large-Scale Experiments on Data-Driven Design of Commercial Spoken Dialog Systems

The design of commercial spoken dialog systems is most commonly based on hand-crafting call flows. Voice interaction designers write prompts, predict caller responses, set speech recognition parameters, implement interaction strategies, all based on “best design practices”. Recently, we presented the mathematical framework “Contender” (similar to reinforcement learning) that allows for replacin...


"do Not Attempt to Light with Match!": Some Thoughts on Progress and Research Goals in Spoken Dialog Systems

In view of the current market consolidation in the speech recognition industry, we ask some questions as to what constitutes the ideas underlying the ‘roadmap’ metaphor. These questions challenge the traditional faith in ever more complex and ‘natural’ systems as the ultimate goals and keys to full commercial success of Spoken Dialog Systems. As we strictly obey that faith, we consider those qu...


How to Drink from a Fire Hose: One Person Can Annoscribe One Million Utterances in One Month

Transcription and semantic annotation (annoscription) of utterances is a crucial part of speech performance analysis and tuning of spoken dialog systems and other natural language processing disciplines. However, the fact that these are manual tasks makes them expensive and slow. In this paper, we will discuss how annoscription can be partially automated. We will show that annoscription can rea...


Embedded Wizardry

This paper presents a progressively challenging series of experiments that investigate clarification subdialogues to resolve the words in noisy transcriptions of user utterances. We focus on user utterances where the user’s specific intent requires little additional inference, given sufficient understanding of the form. We learned decision-making strategies for a dialogue manager from run-time ...


Sorry, I Didn’t Catch That! – An Investigation of Non-understanding Errors and Recovery Strategies

We present results from an empirical analysis of non-understanding errors and ten non-understanding recovery strategies, based on a corpus of dialogs collected with a spoken dialog system that handles conference room reservations. More specifically, the issues under investigation are: what are the main sources of non-understanding errors? What is the impact of these errors on global performance?...



Publication date: 2012